ABSTRACT

Identification of student learning behaviors, especially those 
that characterize or distinguish students, can yield impor- 
tant insights for the design of adaptation and feedback mech- 
anisms in Intelligent Tutoring Systems (ITS). In this paper, 
we analyze trace data to identify distinguishing patterns of 
behavior in a study of 51 college students learning about a 
complex science topic with an agent-based ITS that fosters 
self-regulated learning (SRL). Preliminary analysis with an 
Expectation-Maximization clustering algorithm revealed the 
existence of three distinct groups of students, distinguished 
by their test and quiz scores (low for the first group, medium 
for the second group, and high for the third group), their 
learning gains (low, medium, high), the frequency of their 
note-taking (rare, frequent, rare) and note-checking (rare, 
rare, frequent), the proportion of sub-goals attempted (low, 
low, high), and the time spent reading (high, high, low). In 
this paper, we extend this analysis to identify characteris- 
tic learning behaviors and strategies that distinguish these 
three groups of students. We employ a differential sequence 
mining technique to identify differentially frequent activity 
patterns between the student groups and interpret these pat- 
terns in terms of relevant learning behaviors. The results of 
this analysis reveal that high-performing students tend to be 
better at quickly identifying the relevance of a page to their 
subgoal, are more methodical in their exploration of the ped- 
agogical content, rely on system prompts to take notes and 
summarize, and are more strategic in their preparation for 
the post-test (e.g., using the end of their session to briefly
review pages). These results provide a first step in identify- 
ing the group to which a student belongs during the learning 
session, thus making possible a real-time adaptation of the 
system. 

1. INTRODUCTION 

Use of metacognition and self-regulated processes has been 
identified as a key element for successful learning in gen- 
eral [?;?;?;?]. In the particular context of an intelligent 
tutoring system (ITS), this makes it crucial to ensure that
students actively use key self-regulated learning (SRL)
processes, which can be achieved through prompts, scaffolding,
and feedback. A major challenge is to make the ITS
more adaptive to individual learning characteristics, such as 
browsing behavior and initiative in performing appropriate 
SRL processes. 

Using MetaTutor, an agent-based ITS that fosters the use 
of SRL processes, we have collected a large amount of data 
from students interacting with the system while they were 
learning about the human circulatory system. In this paper, 
our goal is to answer two questions: (1) how can students 
be grouped according to their performance and their type of 
interaction with the system? and (2) how do specific learn- 
ing behaviors of high- and low-performing students differ, in 
particular regarding their use of SRL processes in MetaTu- 
tor? 

In this paper, we propose to answer the first question us- 
ing a clustering approach that groups students with similar 
performance and scores on other system interaction metrics. 
For the second question, we analyze members of the three 
clusters (especially comparing high- and low-performing stu- 
dents) with a differential sequence mining method [?], which 
identifies statistically significant differences in frequent be- 
haviors between clusters. 

This paper is organized as follows. In section 2, we start by 
discussing related work that combines clustering and pat- 
tern mining techniques for analysis of data from computer- 
based learning environments. In section 3, we introduce the 
ITS used for data collection, MetaTutor, as well as theoret- 
ical grounding of its key features, which encourage learners 
to perform self-regulation monitoring and strategy as they 
learn with the system. Section 4 describes the data col- 
lected and the relevant events encoded as actions, as well 
as the clustering performed to distinguish different types of 
students. Section 5 presents the principles of the method 
of differential sequence mining, its application to the data, 
and the results obtained in terms of patterns of actions that 
distinguish students from different clusters. Section 6 then 
discusses the practical implications of those findings in terms 
of potential modifications to the ITS, before concluding in 
section 7. 

2. RELATED WORK 

Analysis of trace log data from users’ interactions to better 
understand their learning process and distinguish groups of 
learners (e.g., efficient versus inefficient ones) has been an
important area of research in educational data mining. For
example, Perera et al. [?] follow a 2-step methodology like 
ours, as they start by using a clustering algorithm (k-means) 
to identify strong groups of students collaborating in a soft- 
ware development task using an open environment (TRAC). 
The students are first clustered according to a set of
attributes extracted a posteriori; the authors then use a
modified version of the Generalized Sequential Pattern mining
algorithm [?] to identify frequent sequences of actions that
characterize the most successful groups. In [?], Romero et al.
also use a combination of clustering and sequential pattern 
mining to identify different kinds of browsing behavior that 
students exhibit in their learning environment, “AHA!”, in 
order to provide them links to the most appropriate pages. 
With gStudy, Nesbit et al. [?] are interested in the use of 
self-regulation by students learning from multimedia docu- 
ments. They apply sequential pattern mining to find com- 
mon subsequences between groups of students, although they 
do not perform any clustering beforehand. Martinez et al. [?] 
pursue a similar approach and objective, as they aim to 
discover frequent sequences of actions that distinguish a 
group of students with high achievements from one with 
low achievements. They use a combination of pattern min- 
ing and clustering techniques to identify the most successful 
strategies in the context of a collaborative learning tool on 
a tabletop device. However, they first extract frequent pat- 
terns of actions and then cluster them in order to examine 
clusters of patterns associated with each group. Tang and 
McCalla [?] also use sequence mining and then clustering 
in their web learning environment, to facilitate instructional 
planning and diagnose students’ behaviors.

3. METATUTOR ENVIRONMENT 
3.1 General overview 

MetaTutor is a multi-agent, adaptive hypermedia learning 
environment, which presents challenging human biology sci- 
ence content. The primary goal underlying this environment 
is to investigate how multi-agent systems can adaptively scaffold
SRL and metacognition within the context of learning
about complex biological content. MetaTutor is grounded 
in a theory of SRL that views learning as an active, con- 
structive process whereby learners set goals for their learn- 
ing and then attempt to monitor, regulate, and control their 
cognitive and metacognitive processes in the service of those 
goals [?; ?; ?]. More specifically, MetaTutor is based on sev- 
eral theoretical assumptions of SRL that emphasize the role 
of cognitive, metacognitive (where metacognition is concep- 
tualized as being subsumed under SRL), motivational, and 
affective processes [?; ?]. Moreover, learners must regulate 
their cognitive and metacognitive processes in order to inte- 
grate multiple informational representations available from 
the system. While all students have the potential to regulate,
few do so effectively, possibly because their cognitive or
metacognitive strategies, knowledge, or control are inefficient
or lacking.

As a learning tool, MetaTutor has a multitude of features 
that embody and foster self-regulated learning (cf. Figure ??).
These include four pedagogical agents which guide
students through the learning session and prompt students
to engage in planning, monitoring, and strategic learning
behaviors. In addition, the agents can provide feedback
and engage in a tutorial dialogue in an attempt to scaffold
students’ selection of appropriate sub-goals, accuracy
of metacognitive judgments, and use of particular learning 
strategies. The system also uses natural language processing 
to allow learners to express metacognitive monitoring and 
control processes. For example, learners can type that they 
do not understand a paragraph and can also use the inter- 
face to summarize a static illustration related to the circula- 
tory system. Additionally, MetaTutor collects information 
from user interactions with it to provide adaptive feedback 
on the deployment of students’ SRL behaviors. For example,
students can be prompted to self-assess their understanding
(i.e., system-initiated judgment of learning [JOL])
and are then administered a brief quiz. Results from the 
self-assessment and quiz allow pedagogical agents to pro- 
vide adaptive feedback according to the calibration between 
students’ confidence of comprehension and their actual quiz 
performance. 

During learning, MetaTutor is capable of measuring the
deployment of self-regulatory processes by allowing us to
collect rich, multi-stream data, including: self-report measures
of SRL, on-line measures of cognitive and metacognitive
processes (e.g., concurrent think-alouds), dialogue moves
regarding agent-student interactions, natural language
processing of help-seeking behavior, physiological measures of
motivation and emotions, emerging patterns of effective
problem solving behaviors and strategies, facial data on both
basic (e.g., anger) and learning-centered emotions (e.g.,
boredom), and eye-tracking data regarding the selection,
organization, and integration of multiple representations of
information (e.g., text, diagrams). The collection of these
various data streams is critical to enhancing our understanding
of when, how, and why students regulate or do not reg- 
ulate their learning and adapt their regulatory behaviors. 
These data are then used to develop computational models 
designed to detect, track, model, and foster students’ SRL 
processes during learning. 

3.2 Self-Regulated Learning with MetaTutor 

This paper is theoretically guided by contemporary models
of SRL that emphasize the temporal deployment of these
processes during learning [?]. As such, the goal is to use
multiple measures to detect, track, and model learners’ use
of cognitive, affective, and metacognitive (CAM) processes
during learning. We use Winne and Hadwin’s model
[?; ?] because it proposes that learning occurs in four basic
phases: (1) task definition, (2) goal-setting and planning,
(3) studying tactics, and (4) adaptations to metacognition.
Their model emphasizes the role of metacognitive
monitoring and control as the central aspects of learners’ 
ability to learn complex material across different instruc- 
tional contexts (e.g., using a multi-agent system to track 
and foster SRL) in that information is processed and ana- 
lyzed within each phase of the model. Recently, Azevedo 
and colleagues [?; ?; ?; ?; ?] extended this model and pro- 
vided extensive evidence regarding the role and function of 
several dozen CAM processes during learning with student- 
centered learning environments (e.g., multimedia, hyperme- 
dia, simulations, intelligent tutoring systems). 

In brief, our model makes the following assumptions: (1) 
successful learning involves having learners monitor and con- 
trol (regulate) key CAM processes during learning; (2) SRL 
is context-specific and therefore successful learning may re- 
quire a learner to increase/decrease the use of certain key 

Figure 1: Annotated screenshot of MetaTutor (A: time remaining in the session, B: table of contents, C: current subgoals and
progression, D: embodied pedagogical agent, E: palette of monitoring and strategy actions)

SRL processes at different points in time during learning; 
(3) a learner’s ability to monitor and control both internal
(e.g., prior knowledge) and external factors (e.g., changing
dynamics of the learning environment; relative utility of
an agent’s prompt) is crucial to successful learning; (4) a
learner’s ability to make adaptive, real-time adjustments to
internal and external conditions, based on accurate judgments
of their use of CAM processes, is fundamental to
successful learning; and (5) certain CAM processes (e.g.,
interest, self-efficacy, task value) are necessary to motivate 
a learner to engage and deploy appropriate CAM processes 
during learning and problem solving. This model is best 
suited for this project since it deals specifically with the 
person-in-context perspective and postulates that CAM pro- 
cesses occur during learning with a multi-agent system, which 
will be useful in examining when and how learners will reg- 
ulate their learning about the human circulatory system. 
As such, the macro-level processes used in this paper are 
reading, metacognitive monitoring, and learning strategies. 
Reading behavior is critical since it is the most important 
activity related to acquiring, comprehending, and using con- 
tent knowledge related to the science topic. During reading, 
learners need to monitor and regulate several key processes 
such as: (1) selecting relevant content (i.e., text and
diagrams) based on their current sub-goal; (2) spending
appropriate amounts of time on each page, depending on its
relevance to their current sub-goal; (3) deciding when
to switch or create a new sub-goal; (4) making accurate 
assessments of their emerging understanding; (5) conceptu- 
ally connecting content with prior knowledge; (6) adaptively 
selecting, using, and assessing the effective use of several 
learning strategies including re-reading, coordinating infor- 
mational sources, summarizing, making inferences, in order 
to comprehend the material at various levels (i.e.,
declarative, procedural, and conceptual knowledge); and (7)
making adaptive changes to behavior based on a variety of
external (e.g., quiz scores, quality and timing of agents’ prompts
and feedback) and internal sources (e.g., affective experi- 
ences including both positive and negative affective states, 
perception of task difficulty). In sum, SRL involves the con- 
tinuous monitoring and regulation of CAM processes during 
learning with MetaTutor. 

3.3 Participants and data collection 

While data has been collected over a sample of 148 un- 
dergraduate students from two large public universities in 
North America, we consider for this study only a sub-sample 
of 51 participants from the experimental condition that in- 
cluded the most prompts from the pedagogical agents to 
perform SRL actions and in which students were given some 
adaptive feedback after having performed those actions. Par- 
ticipants from other conditions did not receive a similar ex- 
perience with the system, and therefore the values of the 
variables considered (cf. section ??) were completely different
for them (e.g., they took fewer quizzes as they were not
prompted to self-regulate their learning). The logs considered
contained an average of 1072 events per session (σ = 255).

4. PRELIMINARY STEPS 

4.1 Data preparation, coding and extraction 

For the analysis performed here, as justified in section ??, we 
abstracted the set of collected interactions into three broad 
categories: reading, monitoring, and strategy (cf. Table ??
for the detailed list of actions extracted from the data).

4.1.1 Reading 

A reading action (Read) is coded each time the student clicks
to display a new page of content to read. Reading actions can
be split according to two combinatorial criteria, r and t,
written as Read^r_t, where:

• r stands for the relevance of the page with regard to
the student’s current subgoal (+ for a relevant page,
− for an irrelevant page, 0 if no subgoal is currently
set and relevance can’t be determined);

• t stands for the time the student spent reading the page
(s if they stay less than 15 seconds, the threshold under
which no SRL prompt can be triggered; l otherwise).
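
The coding rule for reading actions can be illustrated with a minimal sketch. The function name `code_read_action`, its arguments, and the plain-text label format `Read^r_t` are assumptions made for illustration; they are not taken from the MetaTutor logging code.

```python
def code_read_action(relevance, seconds, threshold=15.0):
    """Return a Read action label such as 'Read^+_s' (hypothetical helper).

    relevance: True for a page relevant to the current subgoal,
               False for an irrelevant page,
               None when no subgoal is currently set.
    seconds:   time spent on the page, in seconds.
    """
    if relevance is None:
        r = "0"
    elif relevance:
        r = "+"
    else:
        r = "-"
    # Short reads are strictly under the threshold; everything else is long.
    t = "s" if seconds < threshold else "l"
    return f"Read^{r}_{t}"


if __name__ == "__main__":
    print(code_read_action(True, 8))     # short visit to a relevant page
    print(code_read_action(False, 40))   # long visit to an irrelevant page
```

The 15-second default mirrors the threshold stated above; a visit of exactly 15 seconds is treated as a long read because only visits shorter than the threshold count as short.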

4.1.2 Monitoring 

A monitoring action (Mon) is coded when the student per- 
forms, or is prompted to perform, a monitoring action with 
respect to their learning. This monitoring action could be 
a judgment of learning (JOL) about what they have just 
read, a feeling of knowing (FOK) regarding the content of 
the page, an evaluation of the content (CE) relevance with 
respect to their current subgoal, or an assessment of their 
progress towards their current subgoal (MPTG). Monitoring
actions can also be split according to two combinatorial
criteria, e and i, written as Mon^e_i, where:

• e ∈ {+, −, 0} stands for the correctness of the monitoring
evaluation performed by the student (+ if the
evaluation is right, − if it is wrong, 0 if no direct
evaluation is possible for the monitoring process);

• i ∈ {u, a} stands for the initiator of the action (u for
the user, a for the agent).

Following FOKs and JOLs, as well as when the student 
claims to have finished a subgoal, students are asked to
answer a short quiz (of 3 to 10 questions). Those actions,
coded as Quiz, can be split along one dimension and are
then written Quiz^s, where s ∈ {+, −} stands for the success
or failure to pass the test (+ if the student obtained more
than 66% of correct answers, − otherwise).
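
As an illustration of the monitoring and quiz coding described above, the following sketch builds the corresponding labels. The helper names and argument conventions are hypothetical, and the label strings are only a plain-text rendering of the notation.

```python
def code_monitoring_action(correct, initiator):
    """Return a monitoring label such as 'Mon^+_a' (hypothetical helper).

    correct:   True if the student's evaluation was right, False if wrong,
               None when no direct evaluation is possible.
    initiator: 'u' for the user, 'a' for the agent.
    """
    if initiator not in ("u", "a"):
        raise ValueError("initiator must be 'u' or 'a'")
    if correct is None:
        e = "0"
    else:
        e = "+" if correct else "-"
    return f"Mon^{e}_{initiator}"


def code_quiz_action(n_correct, n_questions, pass_ratio=0.66):
    """Return 'Quiz^+' when strictly more than 66% of answers are correct."""
    if n_questions <= 0:
        raise ValueError("a quiz needs at least one question")
    passed = n_correct / n_questions > pass_ratio
    return "Quiz^+" if passed else "Quiz^-"


if __name__ == "__main__":
    print(code_monitoring_action(True, "a"))
    print(code_quiz_action(2, 3))
```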

4.1.3 Strategy 

A strategy action (Str) is coded when the student uses a
strategy to self-regulate their learning, including when the
strategy is prompted by the agent, as well as when the user
independently decides to perform the action. Strategy
actions include a summarization (SUMM) of the page, a
coordination of information sources (COIS) by viewing a related
image, an inference (INF) regarding the reading material,
a re-reading (RR) of a paragraph that was not well understood,
or notes taken about the reading material. This action
can also be split depending on the initiator of the action,
and is then written Str_i, where i ∈ {u, a} as defined in 4.1.2.
Moreover, we distinguish a particular strategy consisting of 
taking or checking notes in the embedded note interface or 
using the electronic paper-based notepad provided next to 
the workstation. These note actions are coded as Notes. 

4.2 User clustering 

4.2.1 Methodology 

In a previous study [?], we ran a cluster analysis over a
subset of 13 variables extracted from the interaction log
after the end of the student’s learning session: pretest and
posttest score, number of subgoal and page quizzes, mean
first score in subgoal and page quizzes, proportion of
subgoals attempted among the 7 possible, number of subgoal
changes, total session duration, time spent reading content,
number of times the student took notes and checked notes,
and the duration of the note-taking episodes. This analysis
employed the Expectation-Maximization (EM) algorithm
as implemented in the Weka data mining package [?]. Since
the number of clusters to find was not determined a priori,
we used a 10-fold cross-validation, during which we
incremented the number of clusters (starting with 1) as long as
the log-likelihood averaged over the 10 folds was increasing
(i.e., we stopped as soon as the log-likelihood with N+1
clusters was lower than with N clusters). We used 1000 different
initialization seeds for the EM algorithm, in order to
compensate for its tendency to get stuck in local optima, and
selected, among the 1000 partitions of students generated,
the most frequent one among those with the most frequently
obtained number of clusters (3).

Table 2: Synthesis of cluster differences (italics mean the
clusters were not significantly different from one another on
that variable, according to an ANOVA with p < 0.05)

Variables                             | Score for each cluster ¹
                                      | 0 | 1 | 2
Pretest score                         | M | L | H
Posttest score                        | M | L | H
Session duration                      | M | M | M
Reading duration                      | H | H | L
Proportion of subgoals attempted      | L | L | H
Number of subgoal changes             | M | L | H
Number of subgoal quizzes             | M | M | M
Mean first score in subgoal quizzes   | M | L | H
Number of page quizzes                | M | M | M
Mean first score in page quizzes      | M | L | H
Number of note taking                 | H | L | L
Number of note checking               | L | L | H
Time spent taking notes               | H | L | L
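
The model-selection procedure described in section 4.2.1 can be sketched as follows. This is not the Weka implementation: the EM fitting itself is left out, and `mean_cv_loglik` is a hypothetical callable that is assumed to return the log-likelihood averaged over the 10 folds for a given number of clusters. The function names and the `max_clusters` safeguard are likewise assumptions.

```python
from collections import Counter


def select_num_clusters(mean_cv_loglik, max_clusters=20):
    """Grow the number of clusters from 1 while the averaged
    cross-validated log-likelihood keeps increasing."""
    n = 1
    best = mean_cv_loglik(n)
    while n < max_clusters:
        candidate = mean_cv_loglik(n + 1)
        if candidate <= best:  # no improvement with N+1 clusters: stop at N
            break
        n, best = n + 1, candidate
    return n


def canonical(partition):
    """Relabel clusters by order of first appearance, so that two
    partitions differing only by a permutation of labels compare equal."""
    mapping = {}
    return tuple(mapping.setdefault(label, len(mapping)) for label in partition)


def most_frequent_partition(partitions):
    """Given one partition per initialization seed, keep those with the
    most frequently obtained number of clusters and return the most
    frequent partition among them."""
    canon = [canonical(p) for p in partitions]
    modal_k = Counter(len(set(p)) for p in canon).most_common(1)[0][0]
    kept = Counter(p for p in canon if len(set(p)) == modal_k)
    return kept.most_common(1)[0][0]


if __name__ == "__main__":
    # Illustrative numbers only, not results from the study.
    scores = {1: -10.0, 2: -8.0, 3: -7.5, 4: -7.9}
    print(select_num_clusters(scores.get))
    print(most_frequent_partition([(0, 0, 1, 2), (1, 1, 0, 2), (0, 0, 0, 1)]))
```

Relabelling partitions before counting them is a design choice of this sketch: EM runs started from different seeds may assign different arbitrary labels to the same grouping of students.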

4.2.2 Results 

Three clusters were obtained, whose characteristics are
summarized in Table ??, where clusters 0, 1 and 2 are made of
21, 14 and 16 students, respectively. Generally, students
from cluster 2 scored high on pretest, posttest and
intermediary quizzes, spent less time than others reading while
attempting more subgoals, and took fewer notes and less time
taking them. In contrast, students from cluster 1 scored low
on pretest, posttest and intermediary quizzes, attempted
fewer subgoals, and took few notes and less time to take them.
Students from cluster 0 generally occupied an intermediate
position in terms of performance and subgoal use, but took
overall more notes and more time to take them. When using
a formula derived from [?] to evaluate learning gains (cf. [?]
for more details), we also found that students from cluster 2
had the most significant knowledge acquisition, as opposed
to those in cluster 1. For all those reasons, cluster 1 will
be referred to as cluster L (for low), cluster 2 as cluster H
(for high) and cluster 0 as cluster M (for medium). The fact
that exactly three clusters (as opposed to any other number)
were extracted might sound unsurprising, but it comes
from the fact that this was the best partition of the subjects
in the 13-dimension space considered.

Table 1: List of actions extracted from MetaTutor interaction logs

Category   | Action name | Description
Read       | Read^+_s    | Student skims through a page relevant for their current subgoal for less than 15s
Read       | Read^-_s    | Student skims through a page irrelevant for their current subgoal for less than 15s
Read       | Read^0_s    | Student skims through a page without having a subgoal set for less than 15s
Read       | Read^+_l    | Student reads a page relevant for their current subgoal for more than 15s
Read       | Read^-_l    | Student reads a page irrelevant for their current subgoal for more than 15s
Read       | Read^0_l    | Student reads a page without having a subgoal set for more than 15s
Monitoring | Mon^+_a     | Student is prompted to evaluate their knowledge, learning or the relevance of the content they are reading, and evaluates correctly
Monitoring | Mon^-_a     | Student is prompted to evaluate their knowledge, learning or the relevance of the content they are reading, and is wrong in their evaluation
Monitoring | Mon^0_a     | Student is prompted to perform a monitoring action that doesn’t require an evaluation
Monitoring | Mon^+_u     | Student takes the initiative of evaluating their knowledge, learning or the relevance of the content they are reading, and evaluates correctly
Monitoring | Mon^-_u     | Student takes the initiative of evaluating their knowledge, learning or the relevance of the content they are reading, and is wrong in their evaluation
Monitoring | Mon^0_u     | Student takes the initiative of performing a monitoring action that doesn’t require an evaluation
Monitoring | Quiz^+      | Student passes a page or subgoal quiz (more than 66% of correct answers)
Monitoring | Quiz^-      | Student fails a page or subgoal quiz (less than 66% of correct answers)
Strategy   | Str_a       | Student is prompted to deploy a strategy to self-regulate
Strategy   | Str_u       | Student takes the initiative of using a strategy to self-regulate
Strategy   | Notes       | Student takes or checks notes using the embedded interface or a paper-based electronic notepad

5. DIFFERENTIAL SEQUENCE MINING 
5.1 Method principles 

To identify important activity patterns in a comparison be- 
tween student clusters, we employ a differential sequence 
mining technique [?]. This technique uses sequence mining 
and two different measures of pattern frequency to identify 
differentially frequent patterns between two sets of action 
sequences. Differential sequence mining combines frequency 
measures and techniques from sequential pattern mining [?], 
which determines the most frequent action patterns across 
a set of action sequences, and episode mining [?], which de- 
termines the most frequently used action patterns within a 
given sequence. 

The sequential pattern mining frequency measure (i.e., how
many sequences/students exhibit the given pattern) is used
to identify patterns common to a group of students. We refer
to this as the “sequence support” (s-support) of the pattern,
and we call patterns meeting a given s-support threshold
s-frequent. In this analysis, we employ an s-support
threshold of 0.5 to focus on patterns exhibited by at least half
of a given group of students. The episode mining frequency
(i.e., the frequency with which the pattern is repeated within
an action sequence) is important for assessing the extent to
which a student relies on a particular pattern of activities.
For a given student, we refer to this as the “instance
support” (i-support), and we call patterns meeting a given
i-support threshold i-frequent. To calculate the i-support of
a pattern for a group of students, we use the mean of the
pattern’s i-support values across all traces in the group.
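
The two frequency measures can be illustrated with a short sketch. It assumes that a pattern must occur as a contiguous run of actions and that occurrences are counted without overlap; both are simplifying assumptions of this example, and the function names are hypothetical.

```python
def i_support(pattern, sequence):
    """Count non-overlapping, contiguous occurrences of `pattern`
    within one action sequence (one student's trace)."""
    pattern = list(pattern)
    n, count, i = len(pattern), 0, 0
    while n and i + n <= len(sequence):
        if list(sequence[i:i + n]) == pattern:
            count += 1
            i += n      # skip past the matched instance
        else:
            i += 1
    return count


def s_support(pattern, sequences):
    """Fraction of sequences in which the pattern occurs at least once."""
    return sum(i_support(pattern, s) > 0 for s in sequences) / len(sequences)


def mean_i_support(pattern, sequences):
    """Group-level i-support: mean of the per-sequence i-support values."""
    return sum(i_support(pattern, s) for s in sequences) / len(sequences)


if __name__ == "__main__":
    # Toy traces for illustration only, not data from the study.
    group = [
        ["Str_a", "Notes", "Read^+_l", "Str_a", "Notes"],
        ["Read^+_l", "Notes"],
        ["Str_a", "Notes"],
    ]
    pattern = ["Str_a", "Notes"]
    print(s_support(pattern, group), mean_i_support(pattern, group))
```

With a threshold of 0.5, a pattern would be called s-frequent in a group when `s_support` returns at least 0.5 for that group's sequences.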

The differential sequence mining technique first uses a se- 
quential pattern mining algorithm to identify the patterns 
that meet a minimum s-support constraint within each group 
[?]. To compare the identified frequent patterns across groups,
we calculate the i-support of each pattern for each student 
(in each group). Using a t-test, we filter the s-frequent pat- 
terns to identify those for which there is a statistically signif- 
icant difference in i-support values between groups. Com- 
paring the mean i-support value for each pattern between 
groups then allows us to focus the comparison on patterns 
that are employed significantly more often by one group than 
the other. 

This comparison produces four distinct categories of fre- 
quent patterns: two categories where the patterns are s- 
frequent in only one group, illustrating patterns primarily 
employed by the respective groups, and two categories where 
the patterns are common to both groups but used signifi- 
cantly more often in one group than the other. The patterns 
in each of these qualitatively distinct categories are
(separately) sorted by the difference in mean group i-support ¹ to
focus the analysis on the most differentially frequent
patterns [?].
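
The filtering and categorization step can be sketched as below. The sketch assumes that the s-support of each pattern in both groups, the mean i-support in both groups, and the t-test p value have already been computed elsewhere; the dictionary keys, category names, and function name are all hypothetical.

```python
def categorize(patterns, s_threshold=0.5, alpha=0.05):
    """Split significantly different patterns into four categories.

    Each pattern is a dict with keys (all hypothetical):
      name     : identifier of the pattern
      s_a, s_b : s-support in groups A and B
      i_a, i_b : mean i-support in groups A and B
      p        : p value of the t-test on per-student i-support
    """
    categories = {"only_a": [], "only_b": [], "both_more_a": [], "both_more_b": []}
    for pat in patterns:
        freq_a = pat["s_a"] >= s_threshold
        freq_b = pat["s_b"] >= s_threshold
        # Keep only patterns that are s-frequent somewhere and significant.
        if not (freq_a or freq_b) or pat["p"] >= alpha:
            continue
        diff = pat["i_a"] - pat["i_b"]
        if freq_a and not freq_b:
            key = "only_a"
        elif freq_b and not freq_a:
            key = "only_b"
        elif diff > 0:
            key = "both_more_a"
        else:
            key = "both_more_b"
        categories[key].append((pat["name"], diff))
    # Within each category, largest difference in mean i-support first.
    for items in categories.values():
        items.sort(key=lambda item: abs(item[1]), reverse=True)
    return categories


if __name__ == "__main__":
    # Made-up values for illustration only.
    rows = [
        {"name": "P1", "s_a": 0.8, "s_b": 0.3, "i_a": 2.0, "i_b": 0.5, "p": 0.01},
        {"name": "P2", "s_a": 0.9, "s_b": 0.7, "i_a": 1.0, "i_b": 3.0, "p": 0.02},
    ]
    print(categorize(rows))
```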

5.2 Application to the data 

In order to identify patterns more closely related to changes
in students’ knowledge and understanding, we decided to
focus mainly on clusters H and L, as defined in section 4.2.2.
Moreover, to further identify the patterns most characteristic
of students in cluster H (resp. L), we identified
differentially frequent patterns with respect to the other two
clusters M and L (resp. M and H) in a secondary analysis.
We employed an s-support threshold of 50% in this analysis,
to consider all the patterns that were exhibited by at
least half of the students in a given cluster, and a standard
value of 0.05 for the t-test cutoff p value. We tried to
preliminarily group sequences of identical actions together, but
the results obtained were not very different from the ones
without grouping, as the data extracted do not display long
sequences of similar actions; therefore, those results are
not reported here. Similarly, although we also considered
the possibility of using gaps of one or more actions when
identifying patterns, we discarded this analysis because the
frequency of events collected in the log is low, which means
that even a gap of only one action could mean that two
actions of a pattern are actually separated by a rather long
period of inactivity.

¹ Even though a pattern may not be s-frequent in a group
of action sequences, it can still occur in some sequences in
the group, so an i-support value can be calculated (or the
i-support is 0 if the pattern does not occur in any trace in
the group).

5.3 Results 

Table ?? displays the patterns with the highest difference
of s-support between clusters H and L (positive value
in column 3) as well as between clusters L and H (negative
value in column 3), provided that the difference is statistically
significant (i.e., a t-test p value below 0.05 in column 4).
It also displays a selection of interesting patterns, which
differed in a statistically significant way between the two
clusters. Columns 6 to 11 provide the results obtained for
that selection of patterns using two different samples of
students: first (columns 6 to 8), cluster H alone and a merge
of clusters L and M, and then (columns 9 to 11), cluster L
alone and a merge of clusters H and M. Columns 5, 8 and
11 show, for the two considered samples, whether only one or
both of them had an s-support above 50% for the considered
pattern. N/A values are used when the difference for the
pattern is not statistically significant between the two
considered samples.

The following observations can be made: 

- According to pattern 1, when prompted to use a strategy
(regardless of the one suggested by the agent), students in
cluster H reacted by taking notes more often than students
in cluster L. We already knew that students in cluster H had
received significantly more prompts from the system, and
taken fewer notes overall than those in cluster L (but checked
them more often). This pattern suggests that the
reason might be that the notes they took mainly came
from prompts from the agents. Moreover, since students who
type a summary are offered the possibility to add
it to their notes, it appears that students from cluster H
must have preferred that strategy, which would also explain
why they spent less time with the note-taking interface open
(since the summary is typed in a different text box, and the
note-taking interface is opened only to add the already typed
text). Finally, the fact that the difference for this pattern is
significant for cluster H vs. L, H vs. M&L and H&M vs. L
indicates that the degree to which a student relies on prompts
for notes or summaries to take notes is directly related to
the cluster to which they belong (i.e., this behavior
is observed more in cluster H than in M, and more in M
than in L).

Similarly, pattern 3 indicates that after a note-taking
event, students from cluster H often moved on to another
relevant page, which they read for an extended period.
Pattern 5, which is a combination of patterns 1 and 3,
confirms the idea that students from cluster H had a very
methodical approach to navigating through the content: they
selected a relevant page, read it until being prompted by
the agent to take notes or summarize it, performed that
action, and then moved on to a new relevant page.
Incidentally, it also indicates their effectiveness in
identifying a page relevant to their current subgoal simply
from its title (since that is all they can see before
opening it). This latter hypothesis is itself reinforced by
the observation that patterns 10 and 11, which correspond
respectively to a succession of brief visits to irrelevant
pages and to a single brief visit to an irrelevant page, are
characteristic of students from clusters M and L, as opposed
to students from cluster H, who seem not even to need to
open the pages to figure out that they are irrelevant to
their current subgoal.

- Pattern 2 simply confirms what we already knew about the
tendency of students in cluster H to answer intermediate
quizzes (for a page or a subgoal) correctly more often.
It also significantly distinguishes members of cluster H
from those in clusters M and L considered together.

- Patterns 4 and 7 relate to pages viewed when the students
did not have any active subgoal set. Pattern 4 indicates
that students in cluster H visited more pages for a long
time without having a subgoal set, which is confirmed by
pattern 7, which also indicates an alternation between short
and long reads when no subgoals were set. As we also know
that students from cluster H attempted more subgoals overall
than students in cluster L, it cannot mean that they simply
refused to set additional subgoals once they had finished
their original ones (e.g., in an attempt to get rid of the
system prompts and feedback), but rather that: a) they might
have spent some time reviewing pages already read before
taking the posttest, and/or b) instead of setting a final
subgoal when they did not have much time left, they took
some time to review the pages they had not yet explored.

This hypothesis can be confirmed by looking at the temporal
distribution of those two patterns: for students in cluster
H, the median times at which they occur are 108 and 112
minutes (for an overall session of approximately 120
minutes), which means that it is during the last 15 minutes
of their learning session that students displayed that kind
of browsing behavior, clearly distinct from the ones they
had displayed earlier in the session.
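
The timing check described above amounts to a median over occurrence times followed by a comparison with the end of the session. A minimal sketch, assuming each occurrence of a pattern is logged as minutes elapsed since the start of the session (the function names and the input layout are hypothetical):

```python
from statistics import median


def median_occurrence_minute(occurrences_by_student):
    """Median occurrence time (in minutes) over all students' occurrences."""
    times = [t for student_times in occurrences_by_student for t in student_times]
    return median(times)


def falls_in_final_window(median_minute, session_minutes=120, window_minutes=15):
    """True if the median lies within the last `window_minutes` of the session."""
    return median_minute >= session_minutes - window_minutes
```

With the reported medians, `falls_in_final_window(108)` and `falls_in_final_window(112)` both hold for a 120-minute session, since 120 - 15 = 105.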

Pattern 6 indicates that students in cluster H seemed to
more often correctly estimate their level of understanding
of the content or the relevance of the page they were
visiting when it was relevant to their current subgoal.
While this pattern is only marginally significant when
comparing clusters H and L, it is statistically significant
when comparing H to M&L, confirming that it is specific to
students in cluster H. It tends to show that other students
not only had difficulty identifying the relevance of a page
from its title, but also, even once they had been able to
spend some time reading its content, were less prone to
correctly evaluate its relevance or their understanding of
it.

This hypothesis seems to be confirmed by the complementary
pattern 8, which indicates that students from cluster L,
when they had spent a long time on a page irrelevant to
their subgoal and were prompted to evaluate its relevance
(the only prompt they can get on a non-relevant page),
tended to be wrong in their evaluation.

If we consider again the temporal distribution of those two
patterns, we can notice that the median times, for students
in cluster L, are 50 and 45 minutes, i.e., less than the
median time of the session (60 minutes). We can therefore
assume that, at least, students from cluster L slightly
improved their capacity to evaluate their learning and the
relevance of a page over time.

Table 3: Significant and most frequent patterns differentiating clusters

                                                        Cluster H vs. L            Cluster H vs. M&L          Cluster H&M vs. L
 #  Pattern                                             I-Supp.  t-test   S-Freq.  I-Supp.  t-test   S-Freq.  I-Supp.  t-test   S-Freq.
                                                        Diff     p value  Cluster  Diff     p value  Cluster  Diff     p value  Cluster
 1  Str(prompted) -> Notes                              3.93     0.002    Both     3.28     0.005    Both     2.30     0.007    Both
 2  Quiz(correct)                                       3.10     0.036    Both     2.09     0.046    Both     2.30     0.086    Both
 3  Notes -> Read(relevant, long)                       2.86     0.004    Both     2.35     0.012    Both     1.71     0.012    Both
 4  Read(no subgoal, long)                              2.63     0.039    H        2.27     0.050    H        1.48     0.107    H&M
 5  Str(prompted) -> Notes -> Read(relevant, long)      2.38     0.001    Both     N/A      N/A      N/A      1.27     0.017    Both
 6  Read(relevant) -> Mon(correct)                      1.96     0.065    Both     1.96     0.048    Both     0.85     0.304    Both
 7  Read(no subgoal, long) -> Read(no subgoal, short)   1.33     0.050    H        1.23     0.061    H        N/A      N/A      N/A
 8  Read(irrelevant, long) -> Mon(incorrect)            -0.54    0.039    L        N/A      N/A      N/A      -0.25    0.360    L
 9  Read(relevant, short) -> Read(irrelevant, long)     -0.65    0.012    L        N/A      N/A      N/A      -0.53    0.038    L
10  Read(irrelevant, short) -> Read(irrelevant, short)  N/A      N/A      N/A      -1.77    0.030    M&L      N/A      N/A      N/A
11  Read(irrelevant, short)                             -3.49    0.149    Both     -2.56    0.036    Both     -2.39    0.321    Both

- Pattern 9 confirms the previous observation that students
in cluster L really had difficulty seeing the relevance of a
page with regard to their subgoal: they did not simply end
up going to random pages that were irrelevant to their
subgoal, or ignore the subgoal they had set, but instead
appeared to sometimes skim through a relevant page, miss its
relevance, and end up spending a long time on a page that
was not relevant to their subgoal. This tendency is shared,
to some extent, with students from cluster M, as the results
of clusters H vs. M&L are also statistically significant.

- A final observation can be made regarding the tendency
of a student to obey system prompts: if we run the same
analysis without distinguishing the correctness of the
students' monitoring evaluations (i.e., by considering
actions $Mon^{a} = Mon^{a}_{+} \cup Mon^{a}_{-}$ and
$Mon^{u} = Mon^{u}_{+} \cup Mon^{u}_{-}$), we observe that
the pattern $Mon^{a} \rightarrow Mon^{u}$ is significantly
more frequent for students in cluster H, which tends to
indicate that when prompted to perform an optional
monitoring action (most likely an MPTG, since otherwise
there should be a Quiz action following the $Mon^{a}$), they
are more prone to accomplish the suggested action.
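
Rerunning the analysis without the correctness distinction amounts to relabeling actions before mining. The sketch below is hypothetical: the string labels such as "Mon_a+" are an assumed encoding of the monitoring actions, not the identifiers used in the actual logs.

```python
# Assumed encoding: "Mon_a" / "Mon_u" prefixes, with "+" or "-" for correctness.
MERGED_LABELS = {
    "Mon_a+": "Mon_a",
    "Mon_a-": "Mon_a",
    "Mon_u+": "Mon_u",
    "Mon_u-": "Mon_u",
}


def collapse_monitoring(sequence):
    """Drop the correctness marker from monitoring actions; keep other actions."""
    return [MERGED_LABELS.get(action, action) for action in sequence]


def count_bigram(sequence, first, second):
    """Number of times `second` immediately follows `first` in the sequence."""
    return sum(1 for x, y in zip(sequence, sequence[1:]) if (x, y) == (first, second))
```

After relabeling each student's sequence, `count_bigram(sequence, "Mon_a", "Mon_u")` gives the per-student occurrence count of the merged pattern, which can then be compared across clusters as for the other patterns.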

6. DISCUSSION 

To summarize the results obtained in the previous section,
we can conclude that students from cluster H are more
inclined to follow the system prompts and the suggestions
to take notes or summarize what they have just learned.
Further, they are more prone to keep applying the same
method for each page they read, are better at identifying a
page relevant to their subgoal from its title, and are more
strategic in their preparation for the posttest (e.g., they
usually use their last 10 to 15 minutes to briefly review
various pages). From an ITS design point of view, the fact
that these students used system prompts to effectively
regulate their learning tends to indicate that the frequency
of Strategy prompts should probably not be reduced. However,
as they seem good at distinguishing relevant pages from
irrelevant ones, they might need less scaffolding regarding
the Monitoring processes. On the other hand, students from
cluster L appear particularly unable to identify pages
relevant to their subgoal, which is probably linked to their
lower prior knowledge. For them, additional scaffolding from
the system would likely be beneficial. However, even when
prompted to monitor their learning, they tend to be mistaken
in their evaluation. Therefore, it could be necessary to go
further than the methods currently employed to suggest ways
in which they can better evaluate the relevance of a page.

7. CONCLUSION, FUTURE DIRECTIONS 

In this paper, we have presented a two-step analysis of data
collected with an ITS designed to foster self-regulated
learning. First, clustering the students using
Expectation-Maximization allowed us to distinguish three
clusters of students with different prior knowledge of the
topic, learning performance, and strategies. We then
described a set of actions extracted from the system
interaction trace log and employed a sequence mining
technique to identify differentially frequent activity
patterns. We used the identified patterns to characterize
students from different clusters, with particular emphasis
on those who had the highest and the lowest learning gains.
We have been able to identify patterns of actions suggesting
that students with high prior knowledge and high learning
gains tended to be more compliant with system prompts, using
them to validate their progression. Further, these students
were better at identifying pages relevant to their subgoals
from the page title and tended to have a phase at the end of
the session during which they reviewed the content in
preparation for the posttest.

The analysis performed here will allow us to more accurately
identify the group to which a student belongs during their
use of MetaTutor and dynamically adapt the scaffolding and
feedback mechanisms accordingly. Another future research
direction will involve the use of other channels of data
collected while students use MetaTutor (eye-tracking
information, affective data extracted from video captures,
and think-aloud data) in order to enhance our identification
and understanding of phases when low-performing students
are unable to properly monitor their learning.